Vocabulary-independent recognition of American Spanish phrases and digit strings

Authors

  • Yeshwant K. Muthusamy
  • John J. Godfrey
Abstract

We describe the development of an R&D recognizer for several Spanish applications, starting from an existing recognition system for American English and modest language-specific resources. The experiments emphasize achieving phonetic accuracy on telephone speech without vocabulary-specific training. We use our basic recognition engine and simple grammar-building tools for predicting word sequences. Only the read sentences from two telephone speech corpora (Voice Across Hispanic America (VAHA) and a smaller TI corpus) are used for training. Word error rates (WER) of 1.9% on telephone service command phrases, 5.5% on telephone numbers, and 12% on continuously spoken sentences are achieved with the newly ported system.
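
For readers unfamiliar with the metric, the sketch below shows how a word error rate is conventionally computed: the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. It is an illustration of the standard definition only. The abstract does not say which scoring tool the authors used, and the function name `word_error_rate` and the Spanish digit strings are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Illustrative only: assumes whitespace tokenization and equal cost for
    every error type, which may differ from the scoring used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words consumed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical seven-digit string with one digit dropped by the recognizer.
reference = "cinco cinco cinco uno dos tres cuatro"
hypothesis = "cinco cinco uno dos tres cuatro"
print(f"WER = {word_error_rate(reference, hypothesis):.3f}")
```

Because insertions count as errors, a WER computed this way can exceed 100% when the hypothesis is much longer than the reference.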

Similar resources

The CSLU speaker recognition corpus

This paper describes the CSLU Speaker Recognition Corpus data collection. The corpus was motivated by a need for speech data from many speakers, under different environmental conditions, with each speaker providing data over a significant period of time. The corpus was designed to provide sufficient data to study phonetic variability within and across sessions, and to design and evaluate system...

An embedded word training procedure for connected digit recognition

The "conventional" way of obtaining word reference patterns for connected word recognition systems is to use isolated word patterns, and to rely on the dynamics of the matching algorithm to account for the differences in connected speech. Connected word recognition, based on such an approach, tends to become unreliable (high error rates) when the talking rate becomes grossly incommensurate with...

DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances

We investigate how to improve the performance of DNN i-vector based speaker verification for short, text-constrained test utterances, e.g. connected digit strings. A text-constrained verification, due to its smaller, limited vocabulary, can deliver better performance than a text-independent one for a short utterance. We study the problem with “phonetically aware” Deep Neural Net (DNN) in its cap...

Recognition of digit strings in noisy speech with limited resources

Automatic recognition of continuously-spoken digits (e.g., telephone numbers or credit card numbers) is feasible with excellent accuracy, even for speaker-independent applications over telephone lines. However, even such relatively simple recognition tasks suffer decreased performance in adverse conditions, such as significant background noise or fading on portable telephone channels. If an appli...

Investigations on discriminative training criteria

In this work, a framework for efficient discriminative training and modeling is developed and implemented for both small and large vocabulary continuous speech recognition. Special attention will be directed to the comparison and formalization of varying discriminative training criteria and corresponding optimization methods, discriminative acoustic model evaluation and feature extraction. A fo...

Publication date: 1997